Abstract

Lost sales inventory models with large lead times, which arise in many practical settings, are notoriously
difficult to optimize due to the curse of dimensionality. In this paper we show that when lead times are
large, a very simple constant-order policy, first studied by Reiman ([28]), performs nearly optimally. The
main insight of our work is that when the lead time is very large, such a significant amount of randomness is
injected into the system between when an order for more inventory is placed and when that order is received,
that "being smart" algorithmically provides almost no benefit. Our main proof technique combines a novel
coupling for suprema of random walks with arguments from queueing theory.

1 Introduction 

In this paper we consider a stochastic inventory control problem under the so-called single-item, periodic- 
review, lost-sales model with positive lead times and independent and identically distributed (i.i.d.) demand. 
This model is based on sales being lost whenever there is insufficient supply to fulfill demand, i.e., unfulfilled 
demand is lost rather than being carried over, or backlogged, to a later time. Furthermore, there is a constant
delay of L > 0 periods (i.e., a single lead time) between when an order for additional inventory is placed and
when that inventory is received. The problem then is to determine the best policy for a series of orders across 
a planning horizon comprised of a finite number of discrete time periods, with the goal of minimizing cost in 
expectation. 

The cost structure of this model consists of a per-unit penalty for lost sales due to unfulfilled demand within 
each period and a per-unit cost for holding excess inventory within each period. Unlike the corresponding 
backorder inventory control problem when unfulfilled demand is fully backlogged from period to period, where 
the optimal policy is well known to be an order-up-to policy, the optimal order policy for the lost-sales inventory 
model is not known in general, and in fact remains poorly understood ([4]).

Such periodic-review, lost-sales models have a long history in the operations research, operations manage- 
ment and management science literature. Here we briefly review only the most relevant literature, and refer the 
reader to the recent survey paper ([4]) for a more comprehensive exposition. This class of models was first in- 
troduced by Bellman et al. in ([2]). Certain properties of the optimal policy were explored for the case of L = 1 
by Karlin and Scarf ([16]) and by Yaspan ([32]), where it was shown that the order-up-to policy is not optimal 
for the lost-sales inventory model. This analysis was extended to the case of general L by Morton ([23]). Other 
properties of the optimal policy, including various notions of convexity and monotonicity, were explored in 
([35, 34, 12]). With respect to computation of the optimal policy, the primary approach taken in the literature is
dynamic programming, combined with various heuristics to speed up computations ([24, 34]). However, since
the state-space of any such dynamic program grows exponentially in the lead time, roughly on the order of $|\mathcal{D}|^L$
(where $|\mathcal{D}|$ is the cardinality of the support of the demand distribution), such computations become extremely
challenging even for lead times less than 10 ([34]). Namely, this family of techniques suffers from the curse of 
dimensionality as the lead time grows. Indeed, even for a lead time of 4 and geometrically distributed demand, 
Zipkin reports in ([34]) that computing the optimal policy requires solving a dynamic program with 228,581
states. This is not surprising because the problem at hand and several closely related problems are known to be 
NP-complete ([10]). 

The difficulty of computing optimal policies for the lost-sales model has led to a considerable body of 
work on heuristics. The computational performance and properties of various algorithms, including order-up- 
to policies, have been analyzed by several authors ([9, 22, 24, 33, 27, 26, 7, 14, 13, 15, 3]). With respect to 
policies that have provable performance guarantees, the breakthrough work of Levi et al. ([18]) proved that a 
certain dual-balancing heuristic, inspired by previous results for other models ([17, 19]), yields a policy whose 
cost is always within a factor of 2 of optimal. Huh et al. ([12]) show that in a certain scaling regime, in which 
the ratio of the lost-sales penalty to the holding cost asymptotically tends to infinity, an order-up-to policy is 
asymptotically optimal; and a similar result has been recently derived by Lu et al. ([21]). Using a very different 
approach, Halman et al. provide an approximate dynamic programming algorithm that, combined with ideas 
from discrete convexity, yields a so-called Fully Polynomial-time Approximation Scheme (FPTAS) for various 
related inventory control problems ([11, 10]). These techniques were recently extended to lost-sales models 
with positive lead time (as considered in this paper) by Chen et al. ([5]), who provide a pseudo-polynomial- 
time additive approximation algorithm. Namely, under a suitable encoding scheme, an algorithm is presented 
that, for any $\epsilon > 0$, returns a policy whose performance differs additively from that of the optimal policy by at
most $\epsilon$, in time which is polynomial in $\epsilon^{-1}$ if the overall encoding length of the problem is held fixed while $\epsilon$
is varied, and otherwise is pseudo-polynomial in the overall encoding length (which grows with the lead time 
L); we refer the reader to ([5]) for details. In a follow-up study ([6]), the authors prove several interesting 
integrality results for these and related models. 

The work closest to our own is that of Reiman ([28]), who proposes a very simple policy for a certain 
continuous-review, lost-sales model with positive lead times, in which demand arrives as a Poisson process. In 
particular, the author analyzes an "open-loop" constant-order policy, which at time 0 selects an interval size r
and simply orders a single unit of inventory every r time units. The author observes that this simple policy can
be analyzed as a D/M/1 queue, and goes on to perform an interesting asymptotic analysis, showing that for
any fixed holding and lost-sales penalty costs, there exists a critical lead time value L* such that: (i) for all lead 
times less than L* , the best base-stock policy outperforms the best constant-order policy; and (ii) for all lead 
times greater than L* , the best constant-order policy outperforms the best base-stock policy. The author makes 
no attempt to compare either policy to the true optimal policy, which the author notes is unknown. 

Of course, there is no a priori reason to believe that such a simple constant-order policy should be nearly 
optimal. However, numerical results from a recent study ([34]), in which the optimal policy is computed for 
a lost-sales model with i.i.d. demand and positive lead times (nearly identical to the model we consider, but 
with discounting), show that the constant-order policy (in which the same fixed constant is ordered in every 
time period) can perform surprisingly well. More precisely, in numerical experiments for a lead time of 4, the 
constant-order policy always incurs an expected cost at most twice that incurred by the optimal policy; in 62.5% 
of the cases, the constant-order policy incurs a cost at most 1.33 times that incurred by the optimal policy; and 
in 38% of the cases, it incurs a cost at most 1.12 times that incurred by the optimal policy. This begs the 
question of how such a simple policy could perform so well on reasonable problem instances. 

In the present paper we derive theoretical results that shed light on this and related phenomena. Specifi- 
cally, we prove that as the lead time grows (with the demand distribution, lost-sales penalty, and holding cost 
remaining fixed), the best constant-order policy is in fact asymptotically optimal. We also establish explicit 
bounds on how large the lead time should be to ensure that the best constant-order policy incurs an expected
cost at most $1 + \epsilon$ times that incurred by the optimal policy. To the best of our knowledge, this is the first
algorithm proven to be within $1 + \epsilon$ of optimal for lost-sales models when the lead time is large, whose runtime
does not grow with the lead time. The main insight of our work is that when the lead time is very large, such 
a significant amount of randomness is injected into the system between when an order for more inventory is 
placed and when that order is received, that "being smart" algorithmically provides almost no benefit. Our main 
proof technique combines a novel coupling for suprema of random walks with arguments from queueing theory. 
Since this simple policy succeeds exactly when known algorithms start running into trouble due to the curse of 
dimensionality, our results open the door for the creation of "hybrid" algorithms that use more elaborate forms 
of dynamic programming when the lead time is small, and gradually transition to less computationally intensive 
algorithms (with the constant-order policy at the extreme) as the lead time grows. 

The remainder of this paper is organized as follows. Section 2 formally defines the model of study, and 
Section 3 states our main results. We prove a certain lower bound on the performance of any policy in Section 4, 
and in Section 5 we analyze the dynamics of the constant-order policy. In Section 6, we bound the performance 
of a particular constant-order policy using a novel coupling for suprema of random walks, with Section 7 
completing the proof of our main results. Section 8 presents closing remarks and ideas for future research. 

2 Model description and problem statement 

Let us consider the following lost-sales inventory optimization problem. One is given as input a time horizon 
T, lead time L, unit holding cost h, unit lost-demand penalty c, and non-negative demand distribution $\mathcal{D}$ with
unbounded support and finite second moment. The problem is to control inventory in the so-called single-item, 
periodic-review, lost-sales model with positive lead times and i.i.d. demand over a finite time horizon. 

Specifically, consider the following model and associated optimization. Time is slotted, where at the start
of each time period $t$ there is an amount of inventory $I^t$ available. There is also an $L$-dimensional pipeline
$q^t = (q_1^t, \dots, q_L^t)$, which represents the vector of orders placed before period $t$, but not yet received. The
dynamics for period $t$ then proceed as follows. First, a new amount $q_1^t$ of goods is added to inventory. Second,
before seeing the demand of period $t$, an order for more inventory is placed. This order must be a function
(albeit possibly a random function) only of the time horizon $T$, the current time $t$, the inventory level at the
start of period $t$ ($I^t$), the pipeline vector at the start of period $t$ ($q^t$), and the model primitives $L, h, c, \mathcal{D}$. In
particular, the ordering decision at time $t$ cannot depend on the realizations of future demand. We call all such
policies admissible policies, and denote the family of admissible policies by $\Pi$.

The pipeline vector is then updated like a queue: the front entry $q_1^t$ is removed, and the new order is
appended at the end. Next, a random demand $D^t$ is drawn i.i.d. from $\mathcal{D}$. The inventory is then updated
according to $I^{t+1} = (I^t + q_1^t - D^t)^+$, noting that $D^t$ is independent of $I^t + q_1^t$. Of course, some demand may
be lost. In particular, the amount of demand lost (due to not having enough inventory on hand) in period $t$ is
denoted by $N^t = (I^t + q_1^t - D^t)^-$. At the end of period $t$ (but before the start of period $t + 1$) there is a holding
cost incurred (for storing excess inventory) equal to $hI^{t+1}$, and a penalty for lost demand incurred equal to $cN^t$.
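The period dynamics just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate_lost_sales`, the callback signature `policy(t, inventory, pipeline)`, and the `demand_sampler(rng)` argument are hypothetical names introduced here, and the system is assumed to start with no inventory and an empty pipeline.

```python
import random
from collections import deque


def simulate_lost_sales(policy, demand_sampler, T, L, h, c, seed=0):
    """Simulate T periods of the lost-sales model and return the total cost.

    policy(t, inventory, pipeline) -> order quantity (hypothetical signature);
    it sees only the state at the start of period t, never future demand.
    demand_sampler(rng) -> one non-negative demand draw.
    Assumes L >= 1, zero starting inventory and an all-zero pipeline.
    """
    rng = random.Random(seed)
    inventory = 0.0
    pipeline = deque([0.0] * L)  # orders placed but not yet received
    total_cost = 0.0
    for t in range(1, T + 1):
        # The order is chosen from the start-of-period state only.
        order = policy(t, inventory, tuple(pipeline))
        arriving = pipeline.popleft()  # front entry of the pipeline arrives
        pipeline.append(order)         # new order joins the back
        demand = demand_sampler(rng)
        net = inventory + arriving - demand
        lost = max(-net, 0.0)          # unmet demand is lost, not backlogged
        inventory = max(net, 0.0)      # inventory carried into the next period
        total_cost += h * inventory + c * lost
    return total_cost
```

For example, `simulate_lost_sales(lambda t, inv, pipe: 1.0, lambda rng: rng.expovariate(1.0), T=1000, L=4, h=1.0, c=4.0)` simulates a constant-order policy against exponentially distributed demand (an illustrative choice of distribution).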

The goal of the planner is to minimize the expected cost incurred over the entire time horizon. In particular,
supposing $I^0 = 0$, $q^0 = 0$, let us define

\[ C^t \triangleq hI^{t+1} + cN^t = h(I^t + q_1^t - D^t)^+ + c(I^t + q_1^t - D^t)^-. \]

Then the planner wishes to find the policy $\pi \in \Pi$ that minimizes $E[\sum_{t=1}^{T} C^t]$, where the expectation is over
the random demand and any random decisions taken by policy $\pi$. For a given policy $\pi$, let $\{N_\pi^t, C_\pi^t, I_\pi^t, q_\pi^t, t =
1, \dots, T\}$ denote the associated random variables (r.v.s) when policy $\pi$ is implemented (all constructed on the
same probability space). The corresponding lost-sales inventory optimization problem is then given by

\[ \min_{\pi \in \Pi} E\Big[\sum_{t=1}^{T} C_\pi^t\Big], \]

or equivalently

\[ \min_{\pi \in \Pi} \sum_{t=1}^{T} E\Big[h(I_\pi^t + q_{\pi,1}^t - D^t)^+ + c(I_\pi^t + q_{\pi,1}^t - D^t)^-\Big]. \tag{1} \]

3 Main results



Our main results show that there exists a very simple constant-order policy which is asymptotically optimal as
$L \to \infty$. This section formally states these results.

3.1 Additional definitions and notations. Let $D$ denote a realization from $\mathcal{D}$. Note that if the same deter-
ministic quantity $r < E[D]$ is ordered in every period, then the inventory evolves exactly as the waiting time
in a GI/GI/1 queue (initially empty) with interarrival distribution $\mathcal{D}$ and processing time distribution (the
constant) $r$; we refer the reader to ([1]) for an excellent discussion of the dynamics and steady-state properties
of the GI/GI/1 queue. Let $I_r^\infty$ denote a r.v. distributed as the steady-state waiting time in the corresponding
GI/GI/1 queue. Namely, $I_r^\infty$ is distributed as $\sup_{k \ge 0} (kr - \sum_{i=1}^{k} D^i)$.
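As a rough numerical illustration of this queueing correspondence, the following sketch runs the Lindley-type recursion satisfied by the inventory under a constant order r and averages the path to estimate the mean of the steady-state inventory. The function names, the burn-in length, and the use of a long-run average as an estimator are assumptions made here for illustration; they are not part of the paper's analysis.

```python
import random


def constant_order_inventory_path(r, demands):
    """Inventory path under a constant order r, starting from zero inventory.

    Each step applies inv <- max(inv + r - d, 0), which is the waiting-time
    (Lindley) recursion of a single-server queue.
    """
    inv = 0.0
    path = []
    for d in demands:
        inv = max(inv + r - d, 0.0)
        path.append(inv)
    return path


def estimate_steady_state_mean(r, demand_sampler, n_periods=200_000,
                               burn_in=10_000, seed=0):
    """Estimate the steady-state mean inventory by a long-run average.

    demand_sampler(rng) is a hypothetical callback returning one demand draw;
    the estimate is only meaningful when r is below the mean demand.
    """
    rng = random.Random(seed)
    demands = [demand_sampler(rng) for _ in range(n_periods)]
    tail = constant_order_inventory_path(r, demands)[burn_in:]
    return sum(tail) / len(tail)
```

Because the same seed produces the same demand sequence, estimates for different values of r are coupled path by path, so a larger r never yields a smaller estimate.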

Lastly, we define several functions which will be instrumental for our analysis. 

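\[ z^*(c,h,\mathcal{D}) \triangleq \arg\min_{x \ge 0} \big( hE[I_x^\infty] - cx \big), \]

\[ f(c,h,\mathcal{D}) \triangleq \sup_{x \ge 0} \Big( \Big(1 - \frac{cE[D]}{hE[(x - D)^+]}\Big) E[(D - x)^+] \Big), \]

\[ g(c,h,\mathcal{D}) \triangleq \min_{x \ge 0} \big( hE[(x - D)^+] + cE[(D - x)^+] \big), \]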

y cA v(e) ^ max ^2000(1 + /"^(l + E 2 [D] + E[D 2 }f{l + (1 + h'^c + (1 + C - X )hf[l + g' 1 )^ 1 , 
300c 3 E 2 [D}g- 2 h- 1 er 2 
When there is no ambiguity, we will make the dependence of $z^*, f, g$ on $c, h, \mathcal{D}$ implicit.



Remarks 

We will later show that $z^*(c, h, \mathcal{D})$ is the best constant possible if the same constant amount has to be or-
dered in every time period. Note that $z^*(c, h, \mathcal{D}) \in [0, E[D])$, since $E[I_0^\infty] = 0$ and $\lim_{x \uparrow E[D]} E[I_x^\infty] = \infty$.

The function $f(c, h, \mathcal{D})$ measures to what extent there exists an $x$ such that $E[(x - D)^+]$ and $E[(D - x)^+]$
are both large. Equivalently, $f$ is large if there exists some $x$ such that the expected holding cost incurred
by having an inventory level of at least $x$, and the expected lost sales cost incurred by having an inventory
level of at most $x$, are both large. Intuitively (under this interpretation), the larger $f$ is, the less benefit
there is to fine-tuning the inventory on hand, since there are significant costs incurred regardless. For
this reason, $f$ will be critical in understanding how well the constant-order policy performs, since such
a policy is an extreme version of not fine-tuning the inventory.

The function $g(c, h, \mathcal{D})$ provides a bound (per time period) on the performance of any policy whatsoever,
since it represents the expected cost incurred in a given period, even if the amount of inventory on hand 
could be chosen when the new demand arrives. For this reason, g will be necessary for stating certain 
bounds, since it provides a convenient method of comparison. 

As we will see, $y_{c,h,\mathcal{D}}(\epsilon)$ captures how large $L$ should be so that our constant-order policy is within
a $(1 + \epsilon)$ multiplicative factor of the optimal policy. Note that in addition to depending on the relative
magnitudes of $E[D]$, $E[D^2]$, $c$, $h$ and their ratios, $y_{c,h,\mathcal{D}}(\epsilon)$ is decreasing in $f$, thus matching our previous
intuition. Further note that for all sufficiently small $\epsilon$, the term $300c^3E^2[D]g^{-2}h^{-1}\epsilon^{-2}$ will dominate
$y_{c,h,\mathcal{D}}(\epsilon)$, due to its quadratic dependence on $\epsilon^{-1}$.
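To make the per-period bound g concrete, here is a small sketch that evaluates it for a demand distribution given as a finite probability mass function. The finite support is an assumption made purely for illustration (the model above assumes unbounded support), and the names `newsvendor_cost` and `g_lower_bound` are introduced here, not taken from the paper. Since the objective is piecewise linear and convex in x with kinks only at demand values, it suffices to evaluate it at 0 and at the support points.

```python
def newsvendor_cost(x, pmf, h, c):
    """Expected single-period cost h*E[(x-D)^+] + c*E[(D-x)^+].

    pmf maps each demand value d to its probability.
    """
    holding = sum(p * max(x - d, 0.0) for d, p in pmf.items())
    lost = sum(p * max(d - x, 0.0) for d, p in pmf.items())
    return h * holding + c * lost


def g_lower_bound(pmf, h, c):
    """Minimum over x >= 0 of the expected single-period cost.

    The cost is piecewise linear and convex in x, so the minimum over
    x >= 0 is attained at x = 0 or at one of the demand values.
    """
    candidates = [0.0] + sorted(pmf)
    return min(newsvendor_cost(x, pmf, h, c) for x in candidates)
```

For instance, with demand equal to 0 or 2 with probability 1/2 each, h = 1 and c = 3, holding x = 2 units costs 1 in expectation while holding nothing costs 3, so `g_lower_bound({0: 0.5, 2: 0.5}, 1.0, 3.0)` evaluates to 1.0.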

3.2 Formal statement of results. For $r \in \mathbb{R}^+$, let $\pi_r$ denote the policy that orders $I_r^\infty + r$ in the first time
period, and $r$ in all later time periods. Let OPT denote the optimal value of the lost-sales inventory optimization
problem (1), or an appropriately defined limsup if the optimal value is not actually attained. We then have our
main theorem and an important corollary.

Theorem 1. For all $\epsilon \in (0, 1)$, $L \ge y_{c,h,\mathcal{D}}(\epsilon)$, and $T \ge 2cE[D]g^{-1}\epsilon^{-1}L$,

\[ \frac{E\big[\sum_{t=1}^{T} C_{\pi_{z^*}}^t\big]}{OPT} \le 1 + \epsilon. \]

Corollary 2.

\[ \lim_{L \to \infty} \limsup_{T \to \infty} \frac{E\big[\sum_{t=1}^{T} C_{\pi_{z^*}}^t\big]}{OPT} = 1. \]

In particular, the simple constant-order policy is asymptotically optimal as $L \to \infty$. Although the explicit
bounds given by $y_{c,h,\mathcal{D}}(\epsilon)$ could be improved by a more careful analysis, we believe that the dependence on the
parameters $E[D^2]$ and $ch^{-1}$ is fundamental, and leave a tight(er) analysis as an interesting open question.

4 Lower bound on any policy 

We now derive a lower bound on the cost incurred by any policy $\pi \in \Pi$ during any consecutive $L$ time periods.
For integers $j, k$, let $\delta_{j,k}$ equal 1 if $j = k$, and 0 otherwise. We begin by explicitly characterizing the cost
incurred under a given policy, proving the following result.

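Lemma 3. For any policy $\pi \in \Pi$ and time $\tau \in [1, T - L]$,

\[ E\Big[\sum_{t=\tau}^{\tau+L-1} C_\pi^t\Big] = h \sum_{k=1}^{L} E\Big[\max_{j=0,\dots,k} \Big( \sum_{i=k+1-j}^{k} (q_{\pi,i}^\tau - D^{\tau+i-1}) + \delta_{j,k} I_\pi^\tau \Big)\Big] + c\Big( E[I_\pi^{\tau+L}] - E[I_\pi^\tau] + LE[D] - \sum_{i=1}^{L} E[q_{\pi,i}^\tau] \Big). \]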


Proof. 

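It follows by a simple induction that for any $k \in [1, L]$,

\[ I_\pi^{\tau+k} = \max_{j=0,\dots,k} \Big( \sum_{i=k+1-j}^{k} (q_{\pi,i}^\tau - D^{\tau+i-1}) + \delta_{j,k} I_\pi^\tau \Big). \tag{2} \]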

Note that for any times $t_1 < t_2$, the net amount of demand that is met during $[t_1, t_2 - 1]$ equals
$\sum_{t=t_1}^{t_2-1} D^t - \sum_{t=t_1}^{t_2-1} N_\pi^t$. It follows that

t 2 -l t2-l *2-l 

t=t\ t=t\ t—t\ 

Furthermore, for any times t\ <t2, 

XX = # +1 -# + 3>*-XXi- (4) 

t=t\ t=t\ t=t\ 

Combining (2), (3) and (4) together with the fact that $q_{\pi,1}^{\tau+k} = q_{\pi,k+1}^{\tau}$, for any $k \in [0, L - 1]$, completes the proof.
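The unrolled representation of the inventory used in this proof, namely that the inventory after k periods equals the maximum over j of the sum of the last j increments (order arriving minus demand), with the starting inventory added only when j = k, can be checked numerically. The sketch below is an illustration only; the function names are hypothetical, and `q[i]` and `d[i]` stand for the order arriving and the demand realized in the (i+1)-th period of the window.

```python
def inventory_by_recursion(start_inventory, q, d):
    """Apply inv <- max(inv + q_i - d_i, 0) period by period."""
    inv = start_inventory
    for qi, di in zip(q, d):
        inv = max(inv + qi - di, 0.0)
    return inv


def inventory_by_max_formula(start_inventory, q, d):
    """Closed form: max over j = 0..k of the sum of the last j increments.

    The starting inventory is added only for j = k (all increments used);
    j = 0 corresponds to the empty sum, i.e. inventory hitting zero last period.
    """
    k = len(q)
    best = float("-inf")
    for j in range(k + 1):
        partial = sum(q[i] - d[i] for i in range(k - j, k))
        if j == k:
            partial += start_inventory
        best = max(best, partial)
    return best
```

With starting inventory 2, arrivals `[1, 0, 3]` and demands `[2, 4, 1]`, both functions return 2: the inventory is wiped out in the second period, so only the last increment (3 - 1) matters.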
We next construct a lower bound by computing the cost that a policy incurs over $L$ consecutive time periods
if the policy can choose the state of the system at the start of those $L$ time periods to be as favorable as possible.
Let $(q_*, I_*)$ denote any solution to the optimization problem

\[ \min_{q \in \mathbb{R}_+^L,\, x \in \mathbb{R}_+} E\Big[\sum_{t=\tau}^{\tau+L-1} C_\pi^t \,\Big|\, q_\pi^\tau = q, I_\pi^\tau = x\Big], \]

where the existence of $(q_*, I_*)$ follows from the fact that $E[\sum_{t=\tau}^{\tau+L-1} C_\pi^t \mid q_\pi^\tau = q, I_\pi^\tau = x]$ is continuous with
respect to $(q, x)$ and goes to infinity as $(q, x)$ goes to infinity, combined with a routine compactness argument.
Note that, without loss of generality, we can take $(q_*, I_*)$ to be independent of the particular policy $\pi$ and the
particular value of $\tau$, so long as $\tau \in [1, T - L]$. Upon conditioning on $\{q_\pi^\tau = q_*, I_\pi^\tau = I_*\}$, the conditional joint
distribution of $\{I_\pi^{\tau+t}, N_\pi^{\tau+t-1}, C_\pi^{\tau+t-1}, t = 1, \dots, L\}$ does not depend on the particular policy $\pi$, and thus we denote
these conditioned r.v.s as $\{I_*^{t+1}, N_*^t, C_*^t, t = 1, \dots, L\}$. It then follows from Lemma 3, since $\{D^i, i \ge 1\}$ are
i.i.d., that for any $\pi \in \Pi$ and $\tau \in [1, T - L]$,

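\[ E\Big[\sum_{t=\tau}^{\tau+L-1} C_\pi^t\Big] \ge h \sum_{k=1}^{L} E\Big[\max_{j=0,\dots,k} \Big( \sum_{i=k+1-j}^{k} q_{*,i} - \sum_{i=1}^{j} D^i + \delta_{j,k} I_* \Big)\Big] + c\Big( E[I_*^{L+1}] - I_* + LE[D] - \sum_{i=1}^{L} q_{*,i} \Big). \]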



To avoid problems "at the boundary" (e.g. ordering an abnormally large amount of supply near the end of
the time horizon), let us fix some integer $L' \in [1, L]$ and define $q_{*'}$ to be the vector whose first $L'$ components
are identical to those of $q_*$, but whose final $L - L'$ components are all set to zero. Additionally, let $I_{*'} = I_*$.
We denote the associated conditioned r.v.s as $\{I_{*'}^{t+1}, N_{*'}^t, C_{*'}^t, t = 1, \dots, L\}$, noting that the joint distribution of
$\{I_{*'}^{t+1}, N_{*'}^t, t = 1, \dots, L'\}$ is identical to that of $\{I_*^{t+1}, N_*^t, t = 1, \dots, L'\}$. Since $q_* \ge q_{*'}$, we may construct
$\{I_*^{t+1}, N_*^t, I_{*'}^{t+1}, N_{*'}^t, t = 1, \dots, L\}$ on the same probability space such that, with probability 1, $I_{*'}^t \le I_*^t$ for
$t \in [1, L + 1]$ and $N_*^t = N_{*'}^t$ for $t \in [1, L']$. By combining the above with the fact that $E[N_{*'}^t] \le E[D]$ for all
$t$, we obtain the following result.

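Lemma 4. For any $\pi \in \Pi$ and $\tau \in [1, T - L]$,

\[ E\Big[\sum_{t=\tau}^{\tau+L-1} C_\pi^t\Big] \ge h \sum_{k=1}^{L} E\Big[\max_{j=0,\dots,k} \Big( \sum_{i=k+1-j}^{k} q_{*',i} - \sum_{i=1}^{j} D^i + \delta_{j,k} I_{*'} \Big)\Big] + c\Big( E[I_{*'}^{L+1}] - I_{*'} + LE[D] - \sum_{i=1}^{L} q_{*',i} - E[D](L - L') \Big). \]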

5 Constant-order policy dynamics 

We next explicitly describe the cost incurred by the policy $\pi_r$, which orders $I_r^\infty + r$ in the first time period
and $r$ in all later time periods. As previously noted, if the same deterministic quantity $r$ is ordered in every
period, the inventory evolves exactly as the waiting time in a GI/GI/1 queue (initially empty) with interarrival
distribution $\mathcal{D}$ and processing time distribution (the constant) $r$. Further recall that $I_r^\infty$ denotes a r.v. distributed
as the steady-state waiting time in the corresponding GI/GI/1 queue, i.e., $I_r^\infty$ is distributed as $\sup_{k \ge 0} (kr -
\sum_{i=1}^{k} D^i)$. It then follows that $\{I_{\pi_r}^{L+k}, k \ge 2\}$ is a stationary sequence of r.v.s, with $I_{\pi_r}^{L+k}$ distributed as $I_r^\infty$ for
all $k \ge 2$.

Let $I_{r,1}^\infty$ denote a particular realization of $I_r^\infty$ such that $I_{r,1}^\infty$ and $\{D^i, i \ge 1\}$ are mutually independent.
Define

\[ W_r^k \triangleq \max_{j=0,\dots,k} \Big( jr - \sum_{i=1}^{j} D^i + \delta_{j,k} I_{r,1}^\infty \Big), \]

\[ i_r^k \triangleq \sup\Big\{ j^* : j^* \in [0, k],\ j^* r - \sum_{i=1}^{j^*} D^i + \delta_{j^*,k} I_{r,1}^\infty = W_r^k \Big\}. \]

In words, $i_r^k$ is the (largest) index at which the random walk $W_r^k$ attains its supremum.
From Lemma 3 and the fact that $\{D^i, i \ge 1\}$ are i.i.d., we have the following result.

Lemma 5. For any $r < E[D]$ and $\tau \in [L + 1, T - L]$,

\[ E\Big[\sum_{t=\tau}^{\tau+L-1} C_{\pi_r}^t\Big] = h \sum_{k=1}^{L} E\Big[ i_r^k r - \sum_{i=1}^{i_r^k} D^i + \delta_{i_r^k,k} I_{r,1}^\infty \Big] + c\big( LE[D] - Lr \big). \]

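Let us define

\[ W_r^\infty \triangleq \max_{j \ge 0} \Big( jr - \sum_{i=1}^{j} D^i \Big), \]

and

\[ i_r^\infty \triangleq \sup\Big\{ j^* : j^* \ge 0,\ j^* r - \sum_{i=1}^{j^*} D^i = W_r^\infty \Big\}. \]

We now characterize the distribution of $i_r^k$ as follows.

Lemma 6. For all $k \ge 1$, $i_r^k$ has the same distribution as $\min(k, i_r^\infty)$.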

Proof. 

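Let $\{\tilde{D}^i, i \ge 1\}$ be an additional sequence of i.i.d. realizations of $D$, mutually independent from $\{D^i, i \ge
1\}$. Then for any $k \ge 1$, we may construct $W_r^k$, $I_{r,1}^\infty$, $\{D^i, i \ge 1\}$ on the same probability space such that

\[ I_{r,1}^\infty = \max_{j \ge 0} \Big( jr - \sum_{i=1}^{j} \tilde{D}^i \Big) \]

and

\[ W_r^k = \max_{j=0,\dots,k} \Big( jr - \sum_{i=1}^{j} D^i + \delta_{j,k} I_{r,1}^\infty \Big). \]

Furthermore, on this probability space, we have

\begin{align*}
W_r^k &= \max_{j=0,\dots,k} \Big( jr - \sum_{i=1}^{j} D^i + \delta_{j,k} \max_{n \ge 0} \Big( nr - \sum_{i=1}^{n} \tilde{D}^i \Big) \Big) \\
&= \max\Big( \max_{j=0,\dots,k-1} \Big( jr - \sum_{i=1}^{j} D^i \Big),\ kr - \sum_{i=1}^{k} D^i + \max_{j \ge 0} \Big( jr - \sum_{i=1}^{j} \tilde{D}^i \Big) \Big) \\
&= \max\Big( \max_{j=0,\dots,k-1} \Big( jr - \sum_{i=1}^{j} D^i \Big),\ \max_{j \ge 0} \Big( (j + k)r - \Big( \sum_{i=1}^{k} D^i + \sum_{i=1}^{j} \tilde{D}^i \Big) \Big) \Big). \tag{5}
\end{align*}

Observe that the joint distribution of all terms appearing within the max operator in (5) is unchanged if we
replace $\sum_{i=1}^{j} \tilde{D}^i$ by $\sum_{i=1}^{j} D^{k+i}$ for all $j$. Given that

\[ \max_{j \ge 0} \Big( (j + k)r - \Big( \sum_{i=1}^{j} D^{k+i} + \sum_{i=1}^{k} D^i \Big) \Big) = \max_{j \ge 0} \Big( (j + k)r - \sum_{i=1}^{j+k} D^i \Big) = \max_{j \ge k} \Big( jr - \sum_{i=1}^{j} D^i \Big), \]

we may construct $W_r^k$, $i_r^k$, $\{D^i\}$ on the same probability space such that

\[ W_r^k = \max\Big( \max_{j=0,\dots,k-1} \Big( jr - \sum_{i=1}^{j} D^i \Big),\ \max_{j \ge k} \Big( jr - \sum_{i=1}^{j} D^i \Big) \Big), \tag{6} \]

with $i_r^k$ equal to the largest index at which the maximum in (6) attains its supremum, if this index is at most
$k - 1$, and $i_r^k$ equal to $k$ otherwise. The lemma then follows. ■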

We conclude this section by expressing the cost incurred by the constant-order policy $\pi_r$ in terms of the r.v.
$I_r^\infty$, and use this representation to show that $z^*$ corresponds to the "best-possible constant".

Lemma 7. For any $r \in [0, E[D])$ and $t \ge L + 1$,

\[ E[C_{\pi_r}^t] = hE[I_r^\infty] + c(E[D] - r). \tag{7} \]

Proof. Note that $E[I_{\pi_r}^{t+1}] = E[I_r^\infty]$, since $\{I_{\pi_r}^t, t \ge L + 2\}$ is a stationary sequence of r.v.s. Similarly,
$E[N_{\pi_r}^t] = E[D] - r$, which follows from (4), combined with the fact that $E[I_{\pi_r}^t] = E[I_{\pi_r}^{t+1}]$ for all $t \ge L + 2$.
Combining the above completes the proof. ■

As a corollary, we find that $z^*$ is indeed the "best constant" among all constant-order policies. In particu-
lar, by minimizing the r.h.s. of (7) w.r.t. $r$, we conclude that

Corollary 8. For any $r \in [0, E[D])$, $E[\sum_{t=1}^{T} C_{\pi_r}^t] \ge E[\sum_{t=1}^{T} C_{\pi_{z^*}}^t]$.
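The characterization of the best constant as the minimizer of the long-run per-period cost h E[I] + c (E[D] - r) suggests a simple numerical approximation: estimate the steady-state mean inventory by simulation for each r on a grid and pick the cheapest. The sketch below is a hedged illustration and not the paper's procedure; the function names, the grid search, the burn-in, and the reuse of one demand sample for every r (common random numbers) are all assumptions introduced here.

```python
def long_run_cost(r, demands, h, c, mean_demand, burn_in):
    """Estimate h*E[steady-state inventory] + c*(E[D] - r) for constant order r."""
    inv = 0.0
    total = 0.0
    count = 0
    for n, d in enumerate(demands):
        inv = max(inv + r - d, 0.0)  # inventory recursion under constant order r
        if n >= burn_in:             # discard the initial transient
            total += inv
            count += 1
    return h * total / count + c * (mean_demand - r)


def best_constant_order(demands, h, c, mean_demand, grid_size=50, burn_in=1000):
    """Grid search over r in [0, E[D]) for the cheapest constant order.

    The same demand sample is reused for every r, so cost differences
    between grid points are not driven by sampling noise alone.
    """
    grid = [mean_demand * i / grid_size for i in range(grid_size)]
    costs = [long_run_cost(r, demands, h, c, mean_demand, burn_in) for r in grid]
    best = min(range(grid_size), key=costs.__getitem__)
    return grid[best], costs[best]
```

Since the grid contains r = 0, whose cost is exactly c E[D], the returned cost can never exceed that of ordering nothing.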

6 Relating the lower bound to the constant-order policy 

In this section we bound the difference between the expected performance of a particular constant-order policy
and that of a general policy $\pi$. Define $r_* \triangleq L^{-1} \sum_{i=1}^{L} q_{*,i}$ and $r_{*'} \triangleq L^{-1} \sum_{i=1}^{L} q_{*',i}$. Note that both $r_*$ and
$r_{*'}$ can roughly be interpreted as the average (over the next $L$ periods) amount that one would choose to have
arrive in each period, if given the opportunity to optimize over all possible starting inventories (see Section 4).

Theorem 9. If $r_{*'} < E[D]$, then for any $\pi \in \Pi$ and $\tau \in [L + 1, T - L]$, $E[\sum_{t=\tau}^{\tau+L-1} C_{\pi_{r_{*'}}}^t] - E[\sum_{t=\tau}^{\tau+L-1} C_\pi^t]$
is at most
oo L 

hL n , p (C - j) +' i E% + cI *' + cE [ d k l - L ')- 

j=(L-L>) k=l 

Proof. 

Fix some $r \in [0, E[D])$. For $k \in [1, L]$, let us construct

/ fc J \ 

X fc = max I ^ - ^ L> 1 + 5,^1*' J 

j-0,..., \ i=k+1 _ j i=1 / 

on the same probability space as $W_r^k$ and $i_r^k$, using the same sequence of demands $\{D^i, i = 1, \dots, L\}$. Since
the maximum of several terms is at least any one of the terms (even if selected randomly in an arbitrary manner),
it follows that w.p. 1, for $k \in [1, L]$,

k ir 

i=fc+l— i* i=l 
Combining with Lemma 4, we conclude that 

r+L-1 L r k i% -i 

t=r k=l L i=fc + i_jfe i=l 

L 

(9) 



^ i=l ' 
For an integer $i$ and set of integers $S$, let $\delta_{i,S} = 1$ if $i \in S$, and 0 otherwise. Observe that

k k k 

^ Q*',i = "^2q*',idi,lk+l-i*:,k} = ^g* / ,A*,[jfc+l-i,c»)i 
j = fe4-l_jfc i=l i=l 

and thus by interchanging the order of summation, we obtain 

L k L L 

k=l i=fe+l— i* i=l fe=i 

Combining this with (8) - (9) renders 

t+L-1 / L L 

t=r i=l k=i 

L ir L 



-X>[E^+^X>(£ = *)) 

k=l i=l k=l ' 

+c(e[I^ +1 ] - I*, + LE[D] - J]?*',* " " L ')) 

^ i=l ' 



By a simple rearrangement of indices, we have 

L L+l-i 
E F (*r > k + 1 - i) = Y Hir +i ~ l > k)- 

k=i k=l 

Moreover, for all $i \ge 1$,

\[ P(\min(i_r^\infty, k + i - 1) \ge k) = P(i_r^\infty \ge k), \]

and thus Lemma 6 yields 

L+l—i L+l-i 

Y PC*? 4 * -1 > fc) = E P(C > fc). 

k=l k=l 

Since $P(i_r^k = k) = P(i_r^\infty \ge k)$, it then follows from (10) - (12) that

t+L-1 / L L+l-i 

e[ y ci\ > E p(v°°>^) 

t=r ^ i=l k=l 

L % 



fe=l i=l fc=l ' 

+c(e[I$ +1 } - I,, + LE[D] - £[£>](£ - L')) 

^ i=i ' 



This together with Lemma 5 shows that $E[\sum_{t=\tau}^{\tau+L-1} C_{\pi_r}^t] - E[\sum_{t=\tau}^{\tau+L-1} C_\pi^t]$ is at most

/ L L L L+l-i L 

^ fe=l fe=l i=l k=l k=l 



^ i=l ' 



We next apply the above with $r = r_{*'}$, noting that for any $r < E[D]$,

oo 

E[v fe ] < £[v°°] = X>(v°°>i)- 



Hence, 



L L L oo 

fc=l fc=l i=l i=l 

L oo 



i=i fc=i 

from which it follows that r*/ ^Li ~ Et=i <?*',* SK"* P(*v > fc ) is at most 

L oo L' oo 

£ P(C>i) = E P(^>i) 

i=l j=L~i+2 1=1 j=L-i+2 

L' oo 

< £ p(*->j) 

i=l j=(L~L>) 
oo 

= Lrv £ P(»v>i). 
j=(L-L') 

Combining the above with the fact that $Lr_{*'} = \sum_{i=1}^{L} q_{*',i}$, and the non-negativity of all relevant terms, com-
pletes the proof. ■

7 Proof of main result 

We now complete the proof of our main result, namely Theorem 1. A few key lemmas that will be useful for 
this purpose are introduced first. We then derive some bounds for various quantities that appear in Theorem 9. 
Lastly, we establish the desired upper bound as a function of L, and explicitly characterize its magnitude for 
large L. 

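7.1 Key Lemmas. Let us begin by defining $\Theta \triangleq \frac{(E[D] - r)^2}{4(E^2[D] + E[D^2])}$, based upon which we establish the follow-
ing set of upper bounds.

Lemma 10. For any $r < E[D]$ and $k \ge 0$,

\[ P(i_r^\infty = k) \le (1 - \Theta)^k, \]
\[ P(i_r^\infty \ge k) \le \Theta^{-1}(1 - \Theta)^k, \]
\[ \sum_{j=k}^{\infty} P(i_r^\infty \ge j) \le \Theta^{-2}(1 - \Theta)^k, \]

and

\[ E[(I_r^\infty)^2] \le 2\Theta^{-3}E^2[D]. \]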
Proof.

By definition, $P(i_r^\infty = k) \le P(\sum_{i=1}^{k} (r - D^i) \ge 0)$. Applying a Chernoff bound, we find that for any
$\theta > 0$,

\[ P(i_r^\infty = k) \le E^k[\exp(\theta(r - D))], \]

where, for $\theta < r^{-1}$,

\begin{align*}
E[\exp(\theta(r - D))] &= \exp(\theta r)E[\exp(-\theta D)] \\
&\le \exp(\theta r)E[(1 + \theta D)^{-1}] \quad (\text{since } \exp(x) \ge 1 + x) \\
&\le (1 + \theta r + \theta^2 r^2)E[(1 + \theta D)^{-1}],
\end{align*}

the final inequality following from a simple Taylor-series expansion. However, with probability 1, we have

\begin{align*}
\frac{1 + \theta r + \theta^2 r^2}{1 + \theta D} &= 1 + \theta(r - D) + \frac{\theta^2 (r^2 - D(r - D))}{1 + \theta D} \\
&\le 1 + \theta(r - D) + \theta^2(r^2 + D^2),
\end{align*}

and thus

\[ E[\exp(\theta(r - D))] \le 1 - \theta(E[D] - r) + \theta^2(r^2 + E[D^2]). \]

Observing that

\[ \frac{E[D] - r}{2(r^2 + E[D^2])} \le \frac{E[D]}{2E[D^2]} \le \frac{E[D]}{2E^2[D]} = \frac{1}{2E[D]} < \frac{1}{r}, \]

we may take $\theta = \theta^* \triangleq \frac{E[D] - r}{2(r^2 + E[D^2])}$ to conclude

\begin{align*}
E[\exp(\theta^*(r - D))] &\le 1 - \frac{(E[D] - r)^2}{4(r^2 + E[D^2])} \\
&\le 1 - \frac{(E[D] - r)^2}{4(E^2[D] + E[D^2])} = 1 - \Theta,
\end{align*}

where the final inequality follows from the fact that $r^2 < E^2[D]$. Combining the above with the basic manip-
ulation of a few geometric series, and the fact that $I_r^\infty \le r i_r^\infty \le E[D] i_r^\infty$ with probability 1, completes the
proof. ■
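The key inequality in this proof, that the moment generating function of r - D evaluated at the tilt (E[D] - r) / (2 (r^2 + E[D^2])) is at most one minus (E[D] - r)^2 / (4 (E^2[D] + E[D^2])), can be sanity-checked numerically. The sketch below does so for a demand distribution given as a finite probability mass function; the finite support and the function name `chernoff_check` are assumptions made here for illustration only.

```python
import math


def chernoff_check(pmf, r):
    """Return (mgf, bound) for the Chernoff step with the tilt from the proof.

    pmf maps each demand value d >= 0 to its probability; requires 0 <= r < E[D].
    mgf   = E[exp(theta*(r - D))] at theta = (E[D] - r) / (2*(r^2 + E[D^2]))
    bound = 1 - (E[D] - r)^2 / (4*(E[D]^2 + E[D^2]))
    """
    mean = sum(p * d for d, p in pmf.items())
    second_moment = sum(p * d * d for d, p in pmf.items())
    if not 0.0 <= r < mean:
        raise ValueError("r must lie in [0, E[D])")
    theta_star = (mean - r) / (2.0 * (r * r + second_moment))
    mgf = math.exp(theta_star * r) * sum(
        p * math.exp(-theta_star * d) for d, p in pmf.items()
    )
    bound = 1.0 - (mean - r) ** 2 / (4.0 * (mean ** 2 + second_moment))
    return mgf, bound
```

For demand equal to 0 or 2 with probability 1/2 each and r = 0.5, the first returned value is roughly 0.95 and the second roughly 0.98, consistent with the inequality.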



Remark. We note that a more precise analysis of the quantities in Lemma 10 would be possible using the 
theory of ladder heights and epochs ([1]), especially the precise results for the relevant moments given in ([29, 
30, 8, 20]) and the recent work in ([25]). However, since the increments of the random walks that we consider 
have a very special structure (i.e. they are absolutely bounded from above), as well as for the sake of simplicity, 
we do not pursue such an analysis here. 

Next, we prove bounds on the expected values of several variables of interest.

Lemma 11. For all $L' \le L - (\lceil (2ch^{-1}L)^{1/2} \rceil + 2)$,

\[ E[I_*^{L'}] \le (\lceil (2ch^{-1}L)^{1/2} \rceil + 2)E[D]. \]

Proof.

Suppose for contradiction that $E[I_*^{L'}] > (\lceil (2ch^{-1}L)^{1/2} \rceil + 2)E[D]$. Observe that $E[I_*^{k+1}] \ge E[I_*^k] - E[D]$
for all $k$, from which it follows that $E[I_*^{L'+k}] \ge E[D](\lceil (2ch^{-1}L)^{1/2} \rceil + 2 - k)$ for all $k \in [1, \lceil (2ch^{-1}L)^{1/2} \rceil + 2]$.
The resulting holding costs ensure that $\sum_{t=L'}^{L} E[C_*^t]$ is strictly greater than

\[ hE[D] \sum_{k=1}^{\lceil (2ch^{-1}L)^{1/2} \rceil} k \ge cE[D]L. \]

However, since the policy that starts with no inventory and orders nothing over the entire time horizon incurs
only a cost of $cE[D]L$ throughout the horizon, the optimality of $(q_*, I_*)$ leads to a contradiction, which proves
the desired bound. ■



Lemma 12.

\[ \sum_{t=1}^{L} E[N_*^t] \ge fL. \]

Proof.

Fix some $x \ge 0$ such that $E[(x - D)^+], E[(D - x)^+] > 0$, and define

\[ p_{x,t} \triangleq P(I_*^t + q_{*,t} \ge x). \]

Then, from the inventory dynamics and independence structure, for all $t \in [1, L]$ we have

\[ E[I_*^{t+1}] \ge p_{x,t}E[(x - D)^+]. \]

The optimality of $I_*, q_*$ ensures

\[ h \sum_{t=1}^{L} E[I_*^{t+1}] \le cLE[D], \]

from which it follows that

\[ \sum_{t=1}^{L} (1 - p_{x,t}) \ge \Big(1 - \frac{cE[D]}{hE[(x - D)^+]}\Big)L. \]

However,

\[ E[N_*^t] \ge (1 - p_{x,t})E[(D - x)^+], \]

thus proving the desired bound. ■

Finally, we bound $E[D] - r_{*'}$ away from zero, specifically establishing the following result.

Lemma 13. For all $L' \le L - (\lceil (2ch^{-1}L)^{1/2} \rceil + 2)$, we have

\[ E[D] - r_{*'} \ge f - L^{-1}(\lceil (2ch^{-1}L)^{1/2} \rceil + 2)E[D]. \]

Proof.

Suppose $L' \in [1, L - (\lceil (2ch^{-1}L)^{1/2} \rceil + 2)]$. Then (4) implies

L L 



1=1 t=l 
It follows from the construction of $q_{*'}$ that
and

L L 

t=l t=l 
Combining the above with the definition of $r_{*'}$, we obtain

\[ Lr_{*'} \le E[I_*^{L'}] + LE[D] - \sum_{t=1}^{L} E[N_*^t]. \]

The desired bound then follows from combining the above with Lemma 11 and Lemma 12. ■

7.2 Completion of the Proof. With Lemmas 10 - 13 in hand, we now complete the proof of Theorem 1. We 
first bound the various terms appearing in Theorem 9, leading to an expression that explicitly depends on L. 
Then we precisely identify the magnitude of L, which provides the desired result. 

Proof. [Proof of Theorem 1] 

Let $L' = L - (\lceil (2ch^{-1}L)^{1/2} \rceil + 2)$, $\pi \in \Pi$ be an arbitrary policy, and $\tau \in [L + 1, T - L + 1]$ an arbitrary
time. Note that

\[ \lceil (2ch^{-1}L)^{1/2} \rceil + 2 \le (2ch^{-1})^{1/2}L^{1/2} + 3. \]

Now, let us fix some $\delta \in (0, 1)$ (to be specified later). It is easily verified that, if $L$ is at least

\[ \psi \triangleq \max\big( 8E^2[D]\delta^{-2}ch^{-1}f^{-2},\ 6E[D]\delta^{-1}f^{-1} \big), \]

then

\[ L^{-1}(\lceil (2ch^{-1}L)^{1/2} \rceil + 2)E[D] \le \delta f, \]

and thus by Lemma 13,

\[ E[D] - r_{*'} \ge (1 - \delta)f. \tag{13} \]
We now bound each of the terms appearing in Theorem 9. To bound the first term, define

\[ \eta \triangleq 4(1 - \delta)^{-2}(1 + E^2[D] + E[D^2])(1 + f^{-1})^2, \]

and note that (13) implies that $\Theta$ (defined before Lemma 10) is at least $\eta^{-1}$. Then by Lemma 10, we have

\begin{align*}
hLr_{*'} \sum_{j=L-L'}^{\infty} P(i_{r_{*'}}^\infty \ge j) &\le hLE[D]\eta^2(1 - \eta^{-1})^{L-L'} \\
&\le hLE[D]\eta^2 \exp\big( -(2ch^{-1}L)^{1/2}\eta^{-1} \big).
\end{align*}

For the second term, from the Cauchy-Schwarz inequality, Lemma 6, Lemma 10, and (13), we obtain



fe=l 

L 

= hj2^K >k)Eh[(i^f] 

k=l 

L 

< h2^E[D}ri 2 J2( 1 -V~ 1 )^ k 

< h2^E[D]r] 2 



k=l k=l 

L 



k=l 



1 - (1 - tt 1 ) 2 
$\le h\,2^{1/2} E[D]\,\eta^2 (2\eta) = 2^{3/2} h E[D]\,\eta^3.$
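The factor $2\eta$ in the last step comes from a geometric-series bound. As a sketch, assuming the summand is $(1 - \eta^{-1})^{k/2}$ and writing $q$ (a symbol introduced here only for illustration) for $(1 - \eta^{-1})^{1/2}$, with $\eta \ge 4$ by its definition:

```latex
\sum_{k=1}^{L} (1 - \eta^{-1})^{k/2}
  \;\le\; \sum_{k=0}^{\infty} q^{k}
  \;=\; \frac{1}{1 - q}
  \;=\; \frac{1 + q}{1 - q^{2}}
  \;=\; \eta\,(1 + q)
  \;\le\; 2\eta,
\qquad q \triangleq (1 - \eta^{-1})^{1/2} \in [0, 1).
```

The identity $1 - q^2 = \eta^{-1}$ is what turns the denominator into a factor of $\eta$.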
Applying Lemma 11 to i* renders a bound for the third term:

cl*' < c^[L»]((2c/i _1 L)5 +3). 
Finally, the fourth term can be easily bounded as follows: 

$cE[D](L - L') \le cE[D]\left((2ch^{-1}L)^{1/2} + 3\right).$

Combining the above bounds with Theorem 9, we find that ifL>ip, then ^E[=^ _1 C*-v \ ~ E [Td=T ~* C V\ 
is at most 

$\kappa \triangleq hLE[D]\,\eta^2 \exp\left(-(2ch^{-1}L)^{1/2}\eta^{-1}\right) + 2^{3/2} hE[D]\,\eta^3 + 2cE[D]\left((2ch^{-1}L)^{1/2} + 3\right).$

Recall that $g = \min_{x \ge 0}\left(hE[(x - D)^+] + cE[(D - x)^+]\right)$, and thus $E[C_t^{\pi}] \ge g$ for any $t \in [1, T]$. It then
follows that



$\le 1 + \kappa g^{-1} L^{-1}.$ (14)



The intuition for the remainder of our argument is that the right-hand side of (14) is of the order
$1 + O(e^{-\sqrt{L}}) + O(L^{-1}) + O(L^{-1/2})$, and can thus be bounded by $1 + \epsilon$ for all "sufficiently large" $L$. We
now formally complete the proof, by demonstrating that if $\delta = .091$ and $L \ge \max(\psi, y_{c,h,\mathcal{D}}(2\epsilon))$, then
the left-hand side of (14) is at most $1 + \epsilon$. We proceed by bounding $\kappa g^{-1} L^{-1}$ term-by-term, and start by establishing that

$g^{-1} hE[D]\,\eta^2 \exp\left(-(2ch^{-1}L)^{1/2}\eta^{-1}\right) \le \frac{\epsilon}{3}.$ (15)
Indeed, observe that (15) holds if

$L \ge \frac{1}{2} c^{-1} h \eta^2 \log^2\left(3(1 + g^{-1})(1 + h)\max(1, E[D])\,\eta^2 \epsilon^{-1}\right).$ (16)
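The implication from (16) to (15) can be sanity-checked numerically. The sketch below assumes (15) reads g^-1 h E[D] eta^2 exp(-(2 c h^-1 L)^(1/2) eta^-1) <= epsilon/3 and (16) reads L >= (1/2) c^-1 h eta^2 log^2(3 (1 + g^-1)(1 + h) max(1, E[D]) eta^2 epsilon^-1); the function names and all parameter values are hypothetical.

```python
import math

def eta(delta, mean_d, second_moment, f):
    """eta = 4 (1 - delta)^-2 (1 + E^2[D] + E[D^2]) (1 + f^-1)^2."""
    return 4 * (1 - delta) ** -2 * (1 + mean_d**2 + second_moment) * (1 + 1 / f) ** 2

def lhs_15(L, c, h, g, mean_d, eta_):
    """Left-hand side of (15), as assumed in the lead-in."""
    return (h * mean_d * eta_**2 / g) * math.exp(-math.sqrt(2 * c * L / h) / eta_)

def rhs_16(c, h, g, mean_d, eta_, eps):
    """Right-hand side of (16), as assumed in the lead-in."""
    x = 3 * (1 + 1 / g) * (1 + h) * max(1.0, mean_d) * eta_**2 / eps
    return 0.5 * (h / c) * eta_**2 * math.log(x) ** 2

# Illustrative (made-up) values: E[D] = 5, E[D^2] = 30, f = 0.8, g = 2.
c, h, g, mean_d, eps = 4.0, 1.0, 2.0, 5.0, 0.1
eta_ = eta(0.091, mean_d, 30.0, 0.8)
L = rhs_16(c, h, g, mean_d, eta_, eps)
print(lhs_15(L, c, h, g, mean_d, eta_) <= eps / 3)
```

At the threshold value of L the exponential equals the reciprocal of the argument of the logarithm, so the left-hand side of (15) reduces to epsilon/3 times three factors that are each at most 1.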

Since 3?? 2 > 12, and given the fact that log 2 (x) < for all x > 5, it follows that the right-hand side of (16) 
is at most 

1 

3 2 _1 , 1, ,,1/ ^r^iN\i _I 

2 



yc _1 /n) 3 (l + 5 _1 )5(1 + (max(l, £[£>])) 5 e 

1 

= ^ . 64 • (1 - .091)- 6 • (1 + E 2 [D] + E[D 2 ]) 3 (1 + /-^c^l + ^^(l + h)* (max(l, E[D]))% e~* 
< 100(1 + f- 1 f(l + E 2 [D} + E[D 2 }) 3 -^l + (l + h- 1 )c + (l + c- 1 )h)^l + g- 1 )h- 1 2 < j/ cA z>(2e), 



demonstrating (15). 
We next show that 

$g^{-1}L^{-1}\,2^{3/2} hE[D]\,\eta^3 \le \frac{\epsilon}{3},$ (17)

which holds if

$L \ge 3 \cdot 2^{3/2} g^{-1} hE[D]\,\eta^3 \epsilon^{-1}.$ (18)
Observing that the right-hand side of (18) equals

$3 \cdot 2^{3/2} \cdot 4^3 \cdot (1 - .091)^{-6}(1 + E^2[D] + E[D^2])^3(1 + f^{-1})^6 g^{-1} hE[D]\epsilon^{-1}$
$\le 1000(1 + f^{-1})^6(1 + E^2[D] + E[D^2])^4(1 + (1 + h^{-1})c + (1 + c^{-1})h)(1 + g^{-1})\epsilon^{-1} \le y_{c,h,\mathcal{D}}(2\epsilon),$

completes the proof of (17). 
Finally, we establish that

$g^{-1}L^{-1}\,2cE[D]\left((2ch^{-1}L)^{1/2} + 3\right) \le \frac{\epsilon}{3}.$ (19)

This holds if, for some $\gamma \in (0, \frac{1}{3})$, we have $g^{-1}L^{-1}\,2cE[D](2ch^{-1}L)^{1/2} \le \gamma\epsilon$ and $6g^{-1}L^{-1}cE[D] \le (\frac{1}{3} - \gamma)\epsilon$,
which itself holds if $L$ is at least

$\max\left(8\gamma^{-2}c^3h^{-1}g^{-2}E^2[D]\epsilon^{-2},\; 6(\tfrac{1}{3} - \gamma)^{-1}g^{-1}cE[D]\epsilon^{-1}\right).$

Letting $\gamma = (8/75)^{1/2}$, we find that $8\gamma^{-2} = 75$, and $6(\frac{1}{3} - \gamma)^{-1} < 1000$, from which (19) is easily seen to
follow. Combining the above, we conclude that if $L \ge \max(\psi, y_{c,h,\mathcal{D}}(2\epsilon))$, then the right-hand side of (14) is
at most $1 + \epsilon$. Furthermore, since $8 \cdot .091^{-2} < 1000$, it is easily verified that $y_{c,h,\mathcal{D}}(2\epsilon) \ge \psi$. It follows that if
$L \ge y_{c,h,\mathcal{D}}(2\epsilon)$, then the left-hand side of (14) is at most $1 + \epsilon$. Since $\pi$ was general, this demonstrates that for $L \ge y_{c,h,\mathcal{D}}(2\epsilon)$,
over any consecutive L periods, the cost incurred by the policy ir r *' is within a multiplicative factor of 1 + e of 
optimal. 
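The numerical constants used in this part of the argument are easy to confirm directly. The sketch below assumes the exponent of 2 in (18) is 3/2 and takes gamma to be (8/75)^(1/2), the positive value for which 8 gamma^-2 = 75; the variable names are hypothetical.

```python
delta = 0.091
gamma = (8 / 75) ** 0.5  # assumed value, chosen so that 8 * gamma**-2 equals 75

# Leading constant on the right-hand side of (18): 3 * 2^(3/2) * 4^3 * (1 - .091)^-6.
const_18 = 3 * 2**1.5 * 4**3 * (1 - delta) ** -6

# Constant appearing when comparing psi with the threshold: 8 * .091^-2.
const_psi = 8 * delta**-2

# Constant from the second condition behind (19): 6 * (1/3 - gamma)^-1.
const_19 = 6 / (1 / 3 - gamma)

for name, value in [("(18)", const_18), ("psi", const_psi), ("(19)", const_19)]:
    print(name, value, value < 1000)
```

All three values fall below 1000, which is what allows them to be absorbed into the constants of the threshold.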

Let ir OPT be an optimal policy, i.e., a solution to the problem (1), or an appropriately defined subse- 
quential limit if such a policy is only approached. Then we may combine the above analysis with Corol- 
lary 8, and the fact that: £E^i (i,T) C*»,] = E[J2?li (L,T) C\ OPT \ = cE[D] min(L,T), OPT > gT, and 
Ylt=L[^\+i E[Ciz t ] < cE[D]L (which follows from Corollary 8 applied with r = 0), to conclude that 

E[Y,LiC^] cLE[D] 
OPT - gT ' 

Replacing $\epsilon$ by $\frac{\epsilon}{2}$ in the above, and enforcing $\frac{cLE[D]}{gT} \le \frac{\epsilon}{2}$, completes the proof. ■
8 Conclusion 

In this paper, we considered the single-item, periodic-review, lost-sales model with positive lead times and i.i.d. 
demand, for which the optimal policy is poorly understood and computationally intractable. We proved that, 
as the lead time grows (with the demand distribution, lost-sales penalty, and holding cost remaining fixed), a 
simple, open-loop constant-order policy is in fact asymptotically optimal. We also established explicit bounds 
on how large the lead time should be to ensure that the best constant-order policy incurs an expected cost at 
most $1 + \epsilon$ times that incurred by the optimal policy. To the best of our knowledge, this is the first algorithm
proven to be within $1 + \epsilon$ of optimal for lost-sales models when the lead time is large, with a runtime that does
not grow with the lead time. Our main proof technique involved a novel coupling for suprema of random walks, 
and may be useful in other settings. 
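To make the constant-order policy concrete, the following is a minimal simulation sketch and not the analysis of the paper. It assumes that the same quantity r arrives every period (so the lead time plays no role once the pipeline is full), that holding cost is charged on end-of-period inventory, and that the best r is found by a simple grid search; the function names and the sample data are hypothetical.

```python
def constant_order_cost(r, demands, h, c, initial_inventory=0.0):
    """Average per-period cost when r units arrive every period (lost sales)."""
    inventory = initial_inventory
    total = 0.0
    for d in demands:
        on_hand = inventory + r            # the constant order arrives
        lost = max(d - on_hand, 0.0)       # unmet demand is lost, not backlogged
        inventory = max(on_hand - d, 0.0)  # leftover stock carries over
        total += h * inventory + c * lost
    return total / len(demands)

def best_constant_order(demands, h, c, grid):
    """Grid search for the order quantity with the lowest average cost."""
    return min(grid, key=lambda r: constant_order_cost(r, demands, h, c))

# Tiny illustrative demand path with h = 1 and c = 2.
demands = [2, 0, 3]
print(constant_order_cost(1, demands, h=1.0, c=2.0))
print(best_constant_order(demands, h=1.0, c=2.0, grid=[0, 1, 2]))
```

Because the policy ignores the system state, evaluating one candidate r is a single pass over the demand path, and the cost of doing so does not depend on the lead time.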

This work leaves many interesting directions for future research. We suspect that our explicit bounds are not
tight, and a more precise analysis of the performance of the constant-order policy would go a long way towards
helping explain the good performance of the algorithm for lead times as small as 4, as reported in ([34]). Since
lost sales models commonly arise in practice, an interesting challenge is to combine the core ideas of our analysis
with known results from dynamic programming to derive and analyze practical "hybrid" algorithms, which
use more elaborate forms of dynamic programming when the lead time is small, and gradually transition to less
computationally intensive algorithms (with the constant-order policy at the extreme) as the lead time grows. It
would also be interesting to prove that a similar phenomenon occurs in other inventory models. Indeed, the
message of our paper falls under the broad heading of "long-range independence / decay of correlations" phenomena.
Such ideas have led to significant progress on models in other fields (e.g. [31]), and may prove useful
in other operations management problems.